-
Abstract. Background: Many approaches have been developed to overcome technical noise in single cell RNA-sequencing (scRNAseq). As researchers dig deeper into data (looking for rare cell types, subtleties of cell states, and details of gene regulatory networks), there is a growing need for algorithms with controllable accuracy and fewer ad hoc parameters and thresholds. Impeding this goal is the fact that an appropriate null distribution for scRNAseq cannot simply be extracted from data in which ground truth about biological variation is unknown, which is usually the case. Results: We approach this problem analytically, assuming that scRNAseq data reflect only cell heterogeneity (what we seek to characterize), transcriptional noise (temporal fluctuations randomly distributed across cells), and sampling error (i.e., Poisson noise). We analyze scRNAseq data without normalization (a step that skews distributions, particularly for sparse data) and calculate p values associated with key statistics. We develop an improved method for selecting features for cell clustering and identifying gene–gene correlations, both positive and negative. Using simulated data, we show that this method, which we call BigSur (Basic Informatics and Gene Statistics from Unnormalized Reads), captures even weak yet significant correlation structures in scRNAseq data. Applying BigSur to data from a clonal human melanoma cell line, we identify thousands of correlations that, when clustered without supervision into gene communities, align with known cellular components and biological processes, and highlight potentially novel cell biological relationships. Conclusions: New insights into functionally relevant gene regulatory networks can be obtained using a statistically grounded approach to the identification of gene–gene correlations.
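The role of sampling error described above can be illustrated with a small simulation. This is a minimal sketch and not the BigSur method: it only shows that pure Poisson sampling gives a variance-to-mean ratio (Fano factor) near 1 for raw counts, while cell heterogeneity pushes the ratio above 1. The rates, the cell count, and the helper names `poisson_sample` and `fano_factor` are assumptions chosen for illustration.

```python
import math
import random


def poisson_sample(rate, rng):
    """Draw one Poisson variate with Knuth's multiplication method (fine for small rates)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def fano_factor(counts):
    """Sample variance divided by sample mean; close to 1 for pure Poisson noise."""
    n = len(counts)
    mean = sum(counts) / n
    variance = sum((c - mean) ** 2 for c in counts) / (n - 1)
    return variance / mean


rng = random.Random(0)  # fixed seed so the run is reproducible
n_cells = 5000          # hypothetical number of cells

# Homogeneous population: every cell has the same expected count for this gene.
homogeneous = [poisson_sample(2.0, rng) for _ in range(n_cells)]

# Heterogeneous population: two hypothetical cell states with different expected
# counts (0.5 and 3.5), giving the same overall mean of 2.0.
heterogeneous = [
    poisson_sample(0.5 if i % 2 == 0 else 3.5, rng) for i in range(n_cells)
]

fano_homogeneous = fano_factor(homogeneous)
fano_heterogeneous = fano_factor(heterogeneous)
print(f"Fano factor, homogeneous cells:   {fano_homogeneous:.2f}")
print(f"Fano factor, heterogeneous cells: {fano_heterogeneous:.2f}")
```

Both populations have the same mean count, so a comparison of means alone would not separate them; the excess variance over the Poisson expectation is what signals heterogeneity in this toy setting.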
-
In animal models, Nipbl deficiency phenocopies gene expression changes and birth defects seen in Cornelia de Lange syndrome, the most common cause of which is Nipbl haploinsufficiency. Previous studies in Nipbl+/− mice suggested that heart development is abnormal as soon as cardiogenic tissue is formed. To investigate this, we performed single-cell RNA sequencing on wild-type and Nipbl+/− mouse embryos at gastrulation and early cardiac crescent stages. Nipbl+/− embryos had fewer mesoderm cells than wild-type and altered proportions of mesodermal cell subpopulations. These findings were associated with underexpression of genes implicated in driving specific mesodermal lineages. In addition, Nanog was found to be overexpressed in all germ layers, and many gene expression changes observed in Nipbl+/− embryos could be attributed to Nanog overexpression. These findings establish a link between Nipbl deficiency, Nanog overexpression, and gene expression dysregulation/lineage misallocation, which ultimately manifest as birth defects in Nipbl+/− animals and Cornelia de Lange syndrome.
-
Chronic myeloid leukemia (CML) is a blood cancer characterized by dysregulated production of maturing myeloid cells driven by the product of the Philadelphia chromosome, the BCR-ABL1 tyrosine kinase. Tyrosine kinase inhibitors (TKIs) have proved effective in treating CML, but there is still a cohort of patients who do not respond to TKI therapy even in the absence of mutations in the BCR-ABL1 kinase domain that mediate drug resistance. To discover novel strategies to improve TKI therapy in CML, we developed a nonlinear mathematical model of CML hematopoiesis that incorporates feedback control and lineage branching. Cell–cell interactions were constrained using an automated model selection method together with previous observations and new in vivo data from a chimeric BCR-ABL1 transgenic mouse model of CML. The resulting quantitative model captures the dynamics of normal and CML cells at various stages of the disease and exhibits variable responses to TKI treatment, consistent with those of CML patients. The model predicts that an increase in the proportion of CML stem cells in the bone marrow would decrease the tendency of the disease to respond to TKI therapy, in concordance with clinical data and confirmed experimentally in mice. The model further suggests that, under our assumed similarities between normal and leukemic cells, a key predictor of refractory response to TKI treatment is an increased maximum probability of self-renewal of normal hematopoietic stem cells. We use these insights to develop a clinical prognostic criterion to predict the efficacy of TKI treatment and design strategies to improve treatment response. The model predicts that stimulating the differentiation of leukemic stem cells while applying TKI therapy can significantly improve treatment outcomes.
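To make "feedback control" on self-renewal concrete, the following is a minimal sketch of a generic two-compartment lineage model. It is not the model from the abstract above: the two compartments, the feedback form p = p_max / (1 + g·D), and every parameter value are assumptions chosen for illustration. Stem cells S divide at rate v and self-renew with probability p, and differentiated cells D die at rate d and suppress p.

```python
def simulate_lineage(p_max, gain, v=1.0, d=0.5, s0=1.0, d0=0.0,
                     t_end=200.0, dt=0.01):
    """Forward-Euler integration of a hypothetical two-stage lineage model.

    dS/dt = (2p - 1) * v * S           (net stem-cell expansion)
    dD/dt = 2 * (1 - p) * v * S - d*D  (production minus loss of D)
    p     = p_max / (1 + gain * D)     (assumed negative feedback from D)
    """
    stem, diff = s0, d0
    for _ in range(int(t_end / dt)):
        p = p_max / (1.0 + gain * diff)
        d_stem = (2.0 * p - 1.0) * v * stem
        d_diff = 2.0 * (1.0 - p) * v * stem - d * diff
        stem += dt * d_stem
        diff += dt * d_diff
    return stem, diff


def steady_state(p_max, gain, v=1.0, d=0.5):
    """Analytic fixed point: p = 1/2 gives D* = (2*p_max - 1)/gain and S* = d*D*/v."""
    diff_star = (2.0 * p_max - 1.0) / gain
    stem_star = d * diff_star / v
    return stem_star, diff_star


# Illustrative parameter values only; none are taken from the source.
stem_sim, diff_sim = simulate_lineage(p_max=0.8, gain=0.01)
stem_star, diff_star = steady_state(p_max=0.8, gain=0.01)
print(f"simulated: S = {stem_sim:.2f}, D = {diff_sim:.2f}")
print(f"analytic:  S = {stem_star:.2f}, D = {diff_star:.2f}")
```

In this toy model the steady-state population size grows with the maximum self-renewal probability p_max, which gives some intuition for why such a parameter could matter for treatment response. The actual prognostic criterion in the work above comes from a richer model with lineage branching and both normal and leukemic cells.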
-
The haematopoietic system has a highly regulated and complex structure in which cells are organized to successfully create and maintain new blood cells. It is known that feedback regulation is crucial to tightly control this system, but the specific mechanisms by which control is exerted are not completely understood. In this work, we aim to uncover the underlying mechanisms in haematopoiesis by conducting perturbation experiments, where animal subjects are exposed to an external agent in order to observe the system response and evolution. We have developed a novel Bayesian hierarchical framework for optimal design of perturbation experiments and proper analysis of the data collected. We use a deterministic model that accounts for feedback and feedforward regulation on cell division rates and self-renewal probabilities. A significant obstacle is that the experimental data are not longitudinal; rather, each data point corresponds to a different animal. We overcome this difficulty by modelling the unobserved cellular levels as latent variables. We then use principles of Bayesian experimental design to optimally distribute time points at which the haematopoietic cells are quantified. We evaluate our approach using synthetic and real experimental data and show that an optimal design can lead to better estimates of model parameters.
